智能论文笔记

Domain Generalization -- A Causal Perspective

Paras Sheth , Raha Moraffah , K. Selçuk Candan , Adrienne Raglin , Huan Liu

分类：机器学习

2022-09-30

Machine learning models rely on various assumptions to attain high accuracy. One of the preliminary assumptions of these models is the independent and identical distribution, which suggests that the train and test data are sampled from the same distribution. However, this assumption seldom holds in the real world due to distribution shifts. As a result models that rely on this assumption exhibit poor generalization capabilities. Over the recent years, dedicated efforts have been made to improve the generalization capabilities of these models collectively known as -- \textit{domain generalization methods}. The primary idea behind these methods is to identify stable features or mechanisms that remain invariant across the different distributions. Many generalization approaches employ causal theories to describe invariance since causality and invariance are inextricably intertwined. However, current surveys deal with the causality-aware domain generalization methods on a very high-level. Furthermore, we argue that it is possible to categorize the methods based on how causality is leveraged in that method and in which part of the model pipeline is it used. To this end, we categorize the causal domain generalization methods into three categories, namely, (i) Invariance via Causal Data Augmentation methods which are applied during the data pre-processing stage, (ii) Invariance via Causal representation learning methods that are utilized during the representation learning stage, and (iii) Invariance via Transferring Causal mechanisms methods that are applied during the classification stage of the pipeline. Furthermore, this survey includes in-depth insights into benchmark datasets and code repositories for domain generalization methods. We conclude the survey with insights and discussions on future directions.

translated by 谷歌翻译

超越地球轨道的人类空间勘探将涉及大量距离和持续时间的任务。为了有效减轻无数空间健康危害，数据和空间健康系统的范式转移是实现地球独立性的，而不是Earth-Reliance所必需的。有希望在生物学和健康的人工智能和机器学习领域的发展可以解决这些需求。我们提出了一个适当的自主和智能精密空间健康系统，可以监控，汇总和评估生物医学状态;分析和预测个性化不良健康结果;适应并响应新累积的数据;并提供对其船员医务人员的个人深度空间机组人员和迭代决策支持的预防性，可操作和及时的见解。在这里，我们介绍了美国国家航空航天局组织的研讨会的建议摘要，以便在太空生物学和健康中未来的人工智能应用。在未来十年，生物监测技术，生物标志科学，航天器硬件，智能软件和简化的数据管理必须成熟，并编织成精确的空间健康系统，以使人类在深空中茁壮成长。

translated by 谷歌翻译

空间生物学研究旨在了解太空飞行对生物的根本影响，制定支持深度空间探索的基础知识，最终生物工程航天器和栖息地稳定植物，农作物，微生物，动物和人类的生态系统，为持续的多行星寿命稳定。要提高这些目标，该领域利用了来自星空和地下模拟研究的实验，平台，数据和模型生物。由于研究扩展到低地球轨道之外，实验和平台必须是最大自主，光，敏捷和智能化，以加快知识发现。在这里，我们介绍了由美国国家航空航天局的人工智能，机器学习和建模应用程序组织的研讨会的建议摘要，这些应用程序为这些空间生物学挑战提供了关键解决方案。在未来十年中，将人工智能融入太空生物学领域将深化天空效应的生物学理解，促进预测性建模和分析，支持最大自主和可重复的实验，并有效地管理星载数据和元数据，所有目标使生活能够在深空中茁壮成长。

translated by 谷歌翻译

由于数据隐私问题，人类的医疗数据可能具有挑战性，难以进行某些类型的实验，或禁止的相关成本。在许多设置中，可以获得来自动物模型或体外细胞系的数据，以帮助增加我们对人类数据的理解。然而，与人类数据相比，该数据已知具有低病因有效性。在这项工作中，我们使用体外数据和动物模型增强了小型人类医疗数据集。我们使用不变的风险最小化（IRM）来阐明通过考虑属于不同数据生成环境的交叉器件数据来阐明不变的功能。我们的模型识别与人类癌症发展相关的基因。我们观察到不同于使用的人和小鼠数据的数量之间的一致性，但是需要进一步的工作来获得结论性见解。作为次要贡献，我们增强了现有的开源数据集，并提供了两个均匀加工，交叉生物的同源基因匹配的数据集。

translated by 谷歌翻译